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Ensembles of networks are used as null models in many applications. However, simple null models 
often show much less clustering than their real-world counterparts. In this paper, we study a model 
where clustering is enhanced by means of a fugacity term as in the Strauss (or "triangle") model, 
but where the degree sequence is strictly preserved - thus maintaining the quenched heterogeneity 
of nodes found in the original degree sequence. Similar models had been proposed previously in 
[R. Milo et al, Science 298, 824 (2002)]. We find that our model exhibits phase transitions as 
the fugacity is changed. For regular graphs (identical degrees for all nodes) with degree k > 2 we 
find a single first order transition. For all non-regular networks that we studied (including Erdos - 
Renyi and scale-free networks) we find multiple jumps resembling first order transitions, together 
with strong hysteresis. The latter transitions are driven by the sudden emergence of "cluster cores": 
groups of highly interconnected nodes with higher than average degrees. To study these cluster cores 
visually, we introduce q-clique adjacency plots. We find that these cluster cores constitute distinct 
communities which emerge spontaneously from the triangle generating process. Finally, we point 
out that cluster cores produce pitfalls when using the present (and similar) models as null models 
for strongly clustered networks, due to the very strong hysteresis which effectively leads to broken 
ergodicity on realistic time scales. 



I. INTRODUCTION 

Networks are an essential tool for modeling complex 
systems. The nodes of a network represent the compo- 
nents of the system and the links between nodes represent 
interactions between those components. Networks have 
been applied fruitfully to a wide variety of social [TJ [2], 
technological [3], and biological [4] systems. Many net- 
work properties have been studied to discover how differ- 
ent functional or generative constraints on the network 
influence the network's structure. In this paper we exam- 
ine five properties of particular importance: the degree 
sequence [5], which counts the number of nodes in the 
network with k links; the clustering coefficient [B], which 
measures the tendency of connected triples of nodes to 
form triangles; the number of q-cliques, i.e. complete 
subgraphs with q nodes; the assortativity [7J, which mea- 
sures the tendency of nodes to connect to other nodes of 
similar degree; and the modularity [8], which measures 
the tendency of nodes in the network to form tightly in- 
terconnected communities. Their formal definitions are 
recalled in Sec. 2. 

Models of network ensembles are of interest because 
they formalize and guide our expectations about real- 
world networks and their properties [9] . The most famous 
are the Erdos-Renyi model of random networks |10j , and 
the scale-free Barabasi- Albert model [TT]. Comparison 
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with an a priori realistic "null" model can also indicate 
which features of a real network are expected based on 
the null model features, and which are surprising and 
thus of interest, as in motif search [IS]. In the latter 
context, the most popular ensemble is the configuration 
model [13 and related variants [TJ], in which all networks 
with a given number of nodes and a given degree sequence 
have the same weight. One problem of the configuration 
model is that it shows far too little clustering; this prob- 
lem is especially important when the model is applied to 
motif search in, e.g., protein interaction networks |15j . 

A model where clustering can be enhanced by means 
of a fugacity term in a network Hamiltonian was intro- 
duced by D. Strauss [16] and studied in detail in [17]. In 
the Strauss model, the density of edges is also controlled 
by a second fugacity. Thus it is a generalization of the 
Erdos-Renyi model with fixed edge probability, not with 
fixed edge number. In the Strauss model there is a strong 
first order phase transition |17j from a phase with weak 
clustering to a phase where nearly all edges condensate in 
a single densely connected cluster consisting of high de- 
gree nodes. This phase transition is often seen as a flaw, 
as it does not allow the intermediate clustering observed 
in most real networks |37j . 

In the present paper, we introduce and analyze the 
Biased Rewiring Model (BRM). As in the configuration 
model, we fix the exact degree sequence accounting for 
quenched heterogeneity in node properties. But as in the 
Strauss model, we control the average number of closed 
triangles by a Hamiltonian 18J containing a conjugate 
fugacity (3. By fixing the degree sequence we prevent 
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the extreme condensation of edges typical of the Strauss 
model, and we might a priori hope to achieve a smooth 
control of the clustering. Indeed, a very similar model, 
but with a slightly different Hamiltonian, had been pro- 
posed in [T^] . 

To our surprise we found that this is not the case, and 
the clustering cannot be smoothly controlled. To search 
for phase transitions, we plotted several characteristics 
(number of triangles, number of q-cliques with q = 4 and 
5, assortativity, and modularity) against (3. In all these 
plots and for all non-regular graphs (i.e. graphs with a 
non-trivial degree distribution) we found several jumps 
which look like first order phase transitions (or large 
Barkhausen jumps in fcrromagnets [TH1 HH1 121] • Asso- 
ciated with these jumps are important hysteresis effects. 
Further, we found that high degree nodes play a crucial 
role in generating these phase transitions. It is thus not 
surprising that a somewhat simpler scenario holds for 
regular graphs (same degree k for all nodes), where we 
found a single phase transition for all k > 2. The only 
case where we found no transition at all is that of reg- 
ular graphs with k = 2. Unfortunately it is only in the 
last, somewhat trivial, case that we can do exact analytic 
calculations. In all other cases our results are based on 
simulations. 

In [12], the Hamiltonian was chosen to bias not to- 
wards a larger number of triangles, but towards a spe- 
cific number. In order to achieve this reliably, one needs 
a fugacity which is larger than that in the BRM. In the 
limit of large fugacities this is similar to a model with a 
hard constraint. In general, statistical models with hard 
constraints show slower relaxation and worse ergodic be- 
havior than models with soft constraints (22]. We ex- 
pect thus that hysteresis effects might be even more pro- 
nounced in the model of [T^] and might render it less 
useful as a null model, even if the problem of phase tran- 
sitions is hidden. For simplicity we shall in the following 
call the model of [T^] "triangle conserving" , although the 
name is not strictly correct. We find that for triangle 
conserving rewiring, important structures remain largely 
unchanged on extremely long time scales, requiring par- 
ticular care when using the method. In general, phase 
transitions, strong hysteresis, and persistent structures 
of highly connected nodes together present substantial 
pitfalls for null-models of clustered networks. 

In the next section we shall collect some basic back- 
ground information, including the precise definitions of 
the model with unbiased rewiring and the Strauss model. 
The definition of the BRM and our numerical proce- 
dure is given in Sec. II. F. Our main results are found 
in Sees. III. A to III.C, while some results for the model 
with hard constraints of Milo et al. |12j are presented in 
Sec. III.D Finally, Sec. IV contains our conclusions. 



II. BACKGROUND 

A. Degree sequences 

The degree of a node is the number of links in which 
the node participates. The network's degree sequence 
{rife | k = 0, 1 . . . fc max } counts the number of nodes in 
the network which degree k. The networks studied in 
this paper are regular (n^ = £fe,k ), Erdos-Renyi (poisso- 
nian rik), and several real world networks with fat tails. 
Network properties often depend strongly on the degree 
sequence [5] . Thus real networks are often compared with 
null models which preserve the degree sequence. 

B. Clustering coefficient and g-cliques 

Three nodes are connected, if at least two of the three 
possible links between them exits. If all three links exist, 
they form a triangle. The clustering coefficient |23j mea- 
sures the "transitivity" of relationships in the network, 
i.e. the probability that three connected nodes are also 
a triangle. Denoting the number of triangles by n& and 
the degree of node i by fej, one has 

C = isr N it tuT' ^ 

2 2-ii=\\ K i 1 ) K i 

If every relationship in the network is transitive, C = 1; 
if no relationships are transitive, C = 0. Note that the 
denominator of equation [T] depends only on the degree 
sequence, and thus C oc tia in any ensemble with fixed 
degrees. 

In addition to C, we can also define similar higher order 
clustering coefficients based on q-cliques, i.e. on complete 
subgraphs with q nodes, as 

^ _ Q n q-clique /„s 
lui=l \q-l) 

where n q - c n quc is the number of q— cliques in the network. 
Notice that C — C3. As we shall see, we can use any C q 
as an order parameter in the phase transitions discussed 
below. 



C. Assortativity 

The assortativity r measures the tendency for nodes 
in the network to be linked to other nodes of a similar 
degree. It is defined as the Pearson correlation coefficient 
between the degrees of nodes which are joined by a link 
0- 

£Ef =1 i?-[Ef =1 id 2 

Here L is the number of links in the network and ji and 
ki are the degrees of nodes at each end of link i. Thus, 
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if high degree nodes are linked exclusively to other high 
degree nodes, r ~ 1. If high degree nodes are exclusively 
linked to low degree nodes, r » — 1. 

D. Modularity 

There are many methods for identifying community 
structure in complex networks |24j . each with its own 
strengths and drawbacks. We shall use a measure pro- 
posed by Newman and Girvan [5] called modularity. As- 
sume one has a given partition of the network into k 
non-overlapping communities. Define as the fraction 
of all edges which connect a node in community i to a 
node in community j. Thus = Y]j eij is the fraction of 
all links which connect to community i. The modularity 
of the partition is then defined as: 

Q = 5> 2l -a 4 2 ), (4) 

i 

and the modularity of the network is the maximum of Q 
over all partitions. Q measures the fraction of 'internal' 
links, versus the fraction expected for a random network 
with the same degree sequence. It is large when commu- 
nities are largely isolated with few cross links. 

The main problem in computing Q for a network is 
the optimization over all partitions, which is usually done 
with some heuristics. The heuristics used in the present 
paper is a greedy algorithm introduced by Newman |25| . 
We start with each node in its own community (i.e., all 
communities are of size 1). Joining two communities i 
and j would produce a change SQij. All pairs are 
checked, and the pair with the largest 5Qij is joined. 
This is repeated until all 5Qij are negative, i.e. until Q is 
locally maximal. We follow the efficient implementation 
of this method described by Clauset et al [26] . 

E. Exponential Network Ensembles and Network 
Hamiltonians 

Let us assume that Q is a set of graphs (e.g. the set of 
all graphs with fixed number N of nodes, or with fixed 
N and fixed number L of links, or with fixed N and 
fixed degree sequence, ...), and G e Q. Following [TB], a 
network Hamiltonian H(G) is any function defined on Q, 
used to define an exponential ensemble (analogous to a 
canonical ensemble in statistical mechanics) by assigning 
a weight 

P{G) cx e~ H(G) (5) 

to any graph, similar to the Boltzmann-Gibbs weight. 

Examples of exponential ensembles are the Erdos- 
Renyi model G(N,p) where H = — Lln[p/(1 — p)] and 
the Strauss model with 

^Strauss = 0L- /3n A . (6) 



Here, p (which is not to be confused with P(G)) is the 
probability that a link exists between any two nodes, 
while 9 and (3 are "fugacities" conjugate to L and n A , 
respectively. 

In the configuration model, Q is the set of all graphs 
with a fixed degree sequence and H = 0. Thus all graphs 
have the same weight. In contrast, in the "triangle con- 
serving" biased model of Milo et al. [H] Q is again the 
set of graphs with fixed degree sequence, but 

#Miio = (9|«A - nA,o|. (7) 

where tt-a.o is some target number of triangles, usually 
the number found in an empirical network. Finally, in 
the BRM, Q is again the same but 

ffBRM = -/3n A - (8) 

Thus, while large weights are given in the BRM (with 
(3 > 0) to graphs with many triangles (high clustering), 
in the model of [12] the largest weights are given to graphs 
with n A = ?^a,o- 



F. Simulations: Rewiring 

Simulations of these ensembles are most easily done 
by the Markov chain Metropolis-Hastings method [22] . 
This is particularly easy for models without fixed degree 
sequences, e.g. the Strauss model. There, new configu- 
rations are simply generated by randomly adding or re- 
moving links. This is not possible for the ensembles with 
fixed degree sequences, where the most natural method is 
rewiring [14] . We will first discuss the unbiased case (the 
configuration model), and then discuss the two biased 
cases Hmuo and H BRM . 



1. Unbiased Rewiring 

Starting from a current graph configuration G, a new 
graph G' is proposed as follows: Two links which have 
no node in common are chosen at random, e.g. X — Y 
and W — Z. Links are then swapped randomly either to 
X—W and Y — Z, or to X—Z and Y—W. If this leads 
to a double link (i.e. one or both of the proposed new 
links is already present), the new graph G' is discarded 
and G is kept. Otherwise, G' is accepted. It is easily seen 
that this conserves the degree sequence, satisfies detailed 
balance, and is ergodic [14]. Thus it leads to equidistri- 
bution among all graphs with the degree sequence of the 
initial graph. 

Although there seem to exist no exact results on the 
speed of equilibration, previous experience [141 115] sug- 
gests that the above unbiased rewiring is very fast indeed, 
and can be used efficiently even for large networks. 
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2. Biased Rewiring 



For biased rewiring with a Hamiltonian H(G), the pro- 
posal stage is the same, and only the acceptance step has 
to be modified, according to the standard Metropolis- 
Hastings procedure [H[27]: If H ( G ') ^ H ( G )> then G ' 
is accepted (unless it has a double link, of course) . Oth- 
erwise the swap is accepted only with a probability 



p = e H(G)-H(G') 



(9) 



which is less than 1. 

The detailed protocols for simulating the two biased 
models studied in this paper are different. For the BRM 
we start with the actual network Go whose degree se- 
quence we want to use, and propose first M unbiased 
swaps, with Mq sufficiently large so that we end up in 
the typical region of the unbiased ensemble. After that 
we increase (3 in small steps (typically 8(3 = 0.002), start- 
ing with (3 = 0. After each step in (3 we propose Mi 
swaps to equilibrate approximately, and then take take 
at fixed f3 an ensemble average (with further equilibra- 
tion) by making m measurements, each separated by M2 
additional proposed swaps. Thus the total number of 
proposed swaps at each fixed (3 is M\ + (m — \)M2- 
Typically, M w f0 6 ,Mi > 10 5 ,M 2 w f0 3 - 10 5 , and 
771 w 500 - 10,000. 

Following the m measurements we increase (3 and re- 
peat this procedure, until a preset maximal value /3 max is 
reached. After that, we reverse the sign of 8(3 and con- 
tinue with the same parameters M\ , M 2 , and m until we 
reach again (3 — 0, thereby forming a hysteresis loop. Fu- 
gacity values during the ascending part of the loop will in 
the following be denoted by (3 + , those in the descending 
part as (3~ . In cases where we start from a real world 
network with tta,o triangles, we choose /3 max sufficiently 
large so that ha((3 + ) > tta,o, i- e - the clustering covered 
by the hysteresis loop includes the clustering coefficient 
of the original network. 

For the biased model of Milo et al. [12] we skip the 
first stage (i.e., we set M = 0), and we jump imme- 
diately to a value of (3 (estimated through preliminary 
runs) which must be larger than the smallest (3 + which 
gave rise to tt-a.o triangles in the ascending part of the 
loop discussed above. We first make M\ swaps to equi- 
librate, and then make m measurements, each separated 
by M2 further swaps (an alternative protocol using mul- 
tiple annealing periods will be discussed in Sec. HID I. 



Averages are taken only over configurations with exactly 
fiA,o triangles. If (3 is too small, the bias will not be suf- 
ficient to keep tta near ha.Oi and tia will drift to smaller 
values. Even if this is not the case and if (3 is sufficiently 
large in principle, the algorithm will slow down if (3 is 
near its lower limit, since then tta will seldom hit its tar- 
get value. On the other hand, if (3 is too large then the 
algorithm resembles an algorithm with rigid constraint, 
which usually leads to increased relaxation times. Thus 
choosing an optimal (3 is somewhat delicate in this model. 
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FIG. 1: (color online) Average number of triangles for net- 
works with fixed k = 3, plotted against (3. All curves are 
obtained by full hysteresis cycles, with Mi = 200000 initial 
swaps after increase/decrease of (3, and M2 = 5000 additional 
swaps after each of m — 40000 measurements at the same 
value of (3. Hysteresis loops are seen for TV > 200, but not for 
< 100. The straight line corresponds to the approximation 
Eq. ill I. 



III. RESULTS 

We explored the behavior of the BRM for three dif- 
ferent classes of degree sequences: Fixed k networks, in 
which every node of the network is degree k; Poisson de- 
gree distributions as in Erdos-Renyi networks; and typi- 
cal fat-tailed distributions as in most empirical networks. 
Although we studied many more cases (Erdos-Renyi net- 
works with different connectivities and sizes and sev- 
eral different protein-protein interaction networks), we 
present here only results for fixed k with different fc, 
for one Erdos-Renyi network, and for two empirical net- 
works with fat-tailed degree distributions: A high energy 
physics collaboration network |28j and a protein-protein 
interaction network for yeast (S. cerevisiae) [29]). In all 
but fixed k networks we found multiple discontinuous 
phase transitions, while we found a single phase tran- 
sition in all fixed k networks with k > 2. 



Fixed k networks, analytic and simulation 
results 



1. Fixed k simulations 

For each k, the configuration with maximal tta is a 
disjoint set of (k+ 1)— cliques, i.e. the graph decomposes 
into disjoint completely connected components of k + 1 
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nodes. When N is divisible by k + 1, this gives 



N 



k + 1 
3 



(10) 



For k = 2, this n 



(fc,max) 



is reached in a smooth way. 



For each k > 3, in contrast, and for sufficiently large 
N, riA first increases proportional to exp(/3), but then 
the increase accelerates and finally it jumps in a discrete 

step to a value very close to n^' m&x \ This is illustrated 
for k = 3 in Fig. [I] where we plot hysteresis curves for n/\ 
against /3. From this and from similar plots for different 
k we observe the following features: 



• For small [3, all curves are roughly described by 



of* 



G 



(11) 



(see the straight line in Fig. [lj, and this approxi- 
mation seems to become exact as N — > oo. Notice 
that this implies that tia is independent of N, and 
the clustering coefficient is proportional to 1/N. 

• While the curves are smooth and do not show hys- 
teresis for small N, they show both jumps and hys- 
teresis above a k— dependent value of N. This is 
our best indication that the phenomenon is basi- 
cally a first order phase transition, similar to the 
one in the Strauss model. Above the jump, the 
curves saturate (within the resolution of the plot) 
the bound given in Eq.(10l. 



• The critical values of [3 increase logarithmically 
with N , although a precise determination is dif- 
ficult due to the hysteresis. Notice that size depen- 
dent critical points are not very common, but there 
are some well known examples. Maybe the most 
important ones are models with long range or mean 
field type interactions, where the number of inter- 
action terms increases faster than TV. In the present 
case the reason for the logarithmic increase of (3 C is 
that networks with fixed k become more and more 
sparse as N increases. Thus also the density of 
triangles (the clustering coefficient) decreases, and 
in a Markov chain MC method, there are increas- 
ingly more proposed moves which destroy triangles 
than moves which create them. To compensate for 
this and make the number of accepted moves equal, 
exp(/3 c ) has to increase cx N. 

In Fig. [2] we show the average number of triangles 
as a function of (3 for fixed k networks, k = 2,3,5, 10, 
and 16, with N — 400 nodes. For each curve we used 
Mi = 4000000 initial swaps after each increase in (3, and 
M 2 = 200000 additional swaps after each of m > 5000 
measurements at the same value of (3. For clarity we 
show only values for increasing f3, although there is strong 
hysteresis for all k > 3 and for N = 400. 

For k = 2 there is not only no hysteresis, but there 
is indeed no indication of any phase transition. As seen 




FIG. 2: Average number of triangles of fixed-fc degree se- 
quence networks, with k = 2,3, 5, 10, and 16, versus the fu- 
gacity (bias) j3. Network size is N = 400 for all curves. In 
these simulations f3 was slowly increased, until a jump in tia 
was seen (for k > 3). The straight line shows the theoretical 
prediction for k = 2: tia = \e li '. The inset shows n/s/e 13 for 
k = 2. 



from the inset, the data for k — 2 are for all values of (3 
very well described by Eq. (Ill, up to the point where it 
reaches the bound Eq.(10). Close to that point there is 



a tiny bump in the curve shown in the inset, that will be 
explained in the next sub-sub-section. 



2. k = 2 analytic results 



We now give an analytical derivation of Eq. (Ill for 
k = 2, and we also show that this should become exact 
in the limit N —> oo. 

In a fixed k = 2 network, there are N nodes and N 
links all arranged in a set of disjoint simple loops. Trian- 
gles are the smallest possible loops, since self-links and 
double links are not allowed. For large N and small (3 
nearly all loops are large, thus the number of loops of 
length < 7 is of order 1/N and can be neglected for 
N — > oo and finite (3, except that we have to allow for 
a small fraction of loops to have length 3, in order to 
achieve equilibration of the rewiring procedure. 

Consider now a network of size N with n\ triangles 
and a triangle bias (3. The rewiring process will reach an 
equilibrium, when the probability of destroying a triangle 
is equal to the probability of creating a new one. 

First we calculate the probabilities of randomly gener- 
ating a swap which destroys a triangle. The total number 
of ways to choose a pair of links and perform a swap is Af 
— N ( N ^~ 1 ') x 2, where JV ^~ 1 - ) gives the number of distinct 
pairs of links and the extra factor of 2 accounts for the 
two possible ways of swapping the links. To destroy a 
triangle, one of the links must be chosen from it, and the 
other from a larger loop (the chance that both links are 
chosen from triangles, which would lead to the destruc- 
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tion of both, can be neglected). There are 3nA possible 
links in triangles to choose from, and (N — 3zja) links 
in larger loops. Thus the probability of choosing a swap 
which would destroy a triangle is 



Network properties 



PA- 



3n A (N-3n A )x2 6n A 



x[l+0(N~% (12) 



TV N 

where the factor of 2 in the numerator corresponds to 
the fact that both possible swaps destroy a triangle and 
the correction term takes also into account the neglected 
loops of lengths 4,5, and 6. 

To add a triangle to the network, two links must be 
chosen from the same long loop. They must be separated 
by exactly two links. There are I such pairs in a loop of 
length £, and thus the total number of such pairs in the 
network is N, neglecting terms of O(l), corresponding 
to the triangles and loops shorter than 7. This leaves us 
with the probability of adding a triangle 

N 
Jf 



PA+ = - = N- 1 x[l + 0{N- 1 )]. 



(13) 



where there is no factor of 2 in the numerator because 
only one of the two possible swaps will lead to triangle 
creation. Balance will be achieved when 



PA+ = e p pa- 



givmg 



G 



(14) 



(15) 



up to correction terms of order 1 /N, which is just Eq. (11) 
for k = 2. 

The simple exponential behavior of tia with (3 oc- 
curs because swaps create/destroy triangles (except in 
the rare case of breaking up a loop of length 6) inde- 
pendently and one at a time. For networks with nodes 
of degree greater than 2 this is still basically true when 
(3 is small. But as j3 increases, nodes cluster together 
more densely, allowing each link to participate in many 
triangles. For large values of f3 these links, once formed, 
become difficult to remove from the network. This coop- 
erativity - in which the presence of triangles helps other 
triangles to form and makes it harder for them to be re- 
moved - explains intuitively the existence of first order 
phase transitions for k > 3 but not for k = 2, where the 
cooperative effect is not possible. 

Indeed, for tia very close to n^' max ' ) there is some 



cooperativity even for k = 2. 

(2, max) 



The configuration with 



riA = n^' ' can be changed only by breaking up two 



triangles and joining their links in a loop of length 6. 

When tia is close to n^' m , link swaps which involve 
two triangles become increasingly prevalent. The ten- 
dency to form and destroy triangles two at a time in- 
troduces a very weak cooperativity, which is only strong 
enough to be effective when n A 2 ' max ' 1 — tia = 0(1). It is 
thus not enough to give rise to a phase transition, but it 
explains the small bump seen in the inset of Fig. [2] 



Network N (k) C 



Q Comment and Ref. 



ER 800 5.0 .002 
HEP 7610 4.1 .33 
Yeast 1373 10.0 .58 



.0004 0.196 Erdos-Renyi 

.29 0.397 scientific collab. [28] 

.58 0.380 protein binding [29] 



TABLE I: The number of nodes N, the number of links L, 
the average degree (k), the clustering coefficient C, and the 
assortativity r for each of the networks discussed in Sec. |III B"| 
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FIG. 3: Average number of triangles in BRM networks with 
an ER degree sequence with 800 nodes and (k) = 5, plotted 
against the bias j3. The lower curve corresponds to slowly 
increasing /3, the upper to decreasing /3. 



B. Networks with non-trivial degree sequences 

We explored the behavior of our biased rewiring model 
for various degree sequences. These included Erdos- 
Renyi graphs with different sizes and different connec- 
tivities and several real- world networks. The latter typi- 
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FIG. 4: Similar to Fig. [3] but for the HEP network (see 
Table m|. The dotted line indicates the number of triangles in 
the real network. 
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FIG. 5: Similar to Fig. |3j but for the Yeast network (see 
Table 0. 
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FIG. 6: (color online) Four network characteristics (modu- 
larity (Q), clustering coefficient (C3), 4-clique clustering co- 
efficient (C4), and assortativity (r)) for BRM networks with 
the Yeast degree sequence of Table [I] versus (3. These data are 
drawn from the same simulation as in Fig. [5j but for clarity 
only the results for increasing values of f3 are shown. 



cally show more or less fat tails. In order to find any de- 
pendence on the fatness, we also changed some of the se- 
quences manually in order to reduce or enhance the tails. 
We found no significant systematic effects beyond those 
visible already from the following three typical networks, 
and restrict our discussion in the following to these: an 
Erdos-Renyi graph [TU] (henceforth ER), a high energy 
physics collaboration network (HEP) [21], and a yeast 
protein binding network ((29] (Yeast). Some of their 
properties are collected in Table |T] 

Figs. [3] [4] and [5] show for these three networks. 
In each case M = 10 6 ,A/i = 1.5 x 10 5 ,M 2 = 50000, 
and m — 500. In each of them a full hysteresis cycle is 
shown, with the lower curves (labeled /3 + ) corresponding 
to increasing and the upper curves (/3 _ ) corresponding 
to decreasing (3. In Figs. [4] and [5] the dotted line shows 
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FIG. 7: (color online) Values of the rescaled characteristics 
nA/nA,max and (r - r min )/(r max - r min ), measured at the 
same values of j3 , and plotted against each other. The points 
represent the values for the real HEP and Yeast networks. 



the number of triangles in the empirical networks. 

For small values of (3 + all three figures exhibit a sim- 
ilar exponential increase in the number of triangles as 
that observed in fixed k networks. At different values 
of (3, however, there is a sudden, dramatic increase in 
fiA, which does not, however, lead to saturation as it did 
for fixed k. This first phase transition is followed by a 
series of further transitions through which the network 
becomes more and more clustered. Many of them are 
comparable in absolute magnitude to the first jump. Al- 
though the rough positions of the jumps depend only on 
the degree sequences, their precise positions and heights 
change slightly with the random number sequences used 
and with the speed with which (3 is increased. Thus 
the precise sequence of jumps has presumably no deeper 
significance, but their existence and general appearance 
seems to be a universal feature found in all cases. 

Associated with the jumps in are jumps in all other 
network characteristics we looked at, see Fig. [6] Al- 
though the locations of the jumps in tia depend slightly 
on the details of the simulation, the jumps in the other 
characteristics occur always at exactly the same positions 
as those in n&. Obviously, at each jump a significant 
re-structuring of the network occurs, which affects all 
measurable quantities. Speculations how these reorga- 
nizations can be best described and what is their most 
"natural" driving mechanism will be given in the next 
subsection. 

In the downward branch of the hysteresis loop, as (3~ 
decreases toward zero, the number of triangles remains 
high for a long time, forming a significant hysteresis loop. 
This loop suggests that all jumps should be seen as dis- 
continuous (first order) phase transitions. Since all stud- 
ied systems are finite, these hysteresis loops would of 
course disappear for infinitely slow increase/decrease of 
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the bias. But the sampling shown involved > 25 million 
attempted swaps at each value of /?, and no systematic 
change in the hysteresis was seen when compared to twice 
as fast sweeps. 

In Fig. [7] we plotted ua against the assortativity for 
the same values of f3 ± , normalizing both quantities to the 
unit interval. The hope was that in this way we would get 
universal curves which are the same for /3 + and j3~ , and 
maybe even across different networks. Indeed we see a 
quite remarkable data collapse. It is certainly not perfect, 
but definitely better than pure chance. It suggests that 
biasing with the BRM leads to networks where the two 
characteristics n A /n AiinliK and (r-r min )/(r max -r min ) are 
strongly - but non-linearly - correlated. This indicates 
a potential scaling relationship between these network 
parameters in our model. For the two empirical networks, 
we show also the real values of these characteristics. They 
fall far from the common curve, indicating that these 
networks are not typical for the BRM with any value of 
p. 

Among the three networks studied here, the ER net- 
work is closest to a fixed k network, and it should thus 
show behavior closest to that studied in the last subsec- 
tion. This is not very evident from Figs. [3] to [5] On 
the other hand, we see clearly from these figures that the 
position of the first transition - in particular in [3 + - de- 
creases with the average degree. Also, hysteresis seems 
to be more closely tied to individual jumps for ER, while 
it is more global (and thus also more important overall) 
for HEP and Yeast. 

For the HEP and Yeast networks, we can compare the 
clustering of the BRM ensemble to that in the real empir- 
ical networks. The latter numbers are shown as a dashed 
lines in Figs. [4] and [5] In both cases, the line intersects 
the hysteresis loop where it is very broad. This means 
that a large value of (3 + is required to reach the real 
network's level of clustering when the bias is increased, 
whereas a much lower value (3~ must be reached before 
these triangles can be rewired out of the network again. 
This gap between (3 + and (3~ at fixed n A has important 
implications for the triangle "conserving" null model of 
Rcf. I2|, as we will discuss later. 



C. Clique adjacency plots and clustering cores 

Up to now we have not given any intuitive arguments 
why clustering seems to increase in several jumps, and 
not in one single jump or in a continuous way. A pri- 
ori one might suggest that each jump is related to the 
break-up of a connected component into disconnected 
subgraphs, just as the phase transition in regular graphs 
was associated to such a break-up. By counting the num- 
bers of disconnected components we found that this is not 
the case, except in special cases [38] . 

Instead, we will now argue that each jump is associated 
with the sudden formation of a highly connected cluster 
of high degree nodes. The first jump in a scan with in- 



creasing P occurs when some of the strongest hubs link 
among themselves, forming a highly connected cluster. 
Subsequent jumps indicate the formations of other clus- 
ters with high intra- but low inter-connections. What 
distinguishes this picture from the standard modularity 
observed in many real-world networks is that it automat- 
ically leads to large assortativity: Since it is high degree 
nodes which form the first cluster (s), there is a strong 
tendency that clusters contain nodes with similar degrees 
(for previous discussions on how clustering of nodes de- 
pends on their degree, see e.g. [201 EH])- Even though the 
modules formed are somewhat atypical, the BRM does 
demonstrate the ability of a bias for triangle formation 
to give rise to community structure de novo, whereas in 
other models, community structure must be put in by 
hand [2]. 

In the following, the clusters of tightly connected nodes 
created by the BRM are called clustering cores. To visu- 
alize them, we use what we call q-clique adjacency plots 
(qCAPs) in the following. A q-clique adjacency plot is 
based on an integer- valued N x N matrix called the 
q-clique adjacency matrix. It is defined as T^- = when 
there is no link between i and j, and otherwise as the 
number of g-cliques which this link is part of. In other 
words, if q — 3, T^~ 3 is non-zero only when i and j are 
connected, and in the 3CAP case it counts the number 
of common neighbors. T?- can be considered a proxim- 
ity measure for nodes: linked nodes with many common 
neighbors are likely to belong to the same community. 
Similar proximity measures between nodes which depend 
on the similarity of their neighborhoods have been used 
in [33J To visualize TL-, we first rank the nodes 
and then plot for each pair of ranks a pixel with corre- 
sponding color or gray scale. Possible ranking schemes 
are by degree, by the number of triangles attached to 
the node, or by achieving the most simple looking, block 
diagonalized, q-clique adjacency. 

Examples for the Yeast degree sequence are given in 
Fig. [8} The four rows, descending from the top, show the 
3CAP for typical members of the BRM ensemble before 
the first jump and after the first, second, and third jumps. 
The plots in the left column show the ranking done by 
the degrees of the nodes. The plots on the right show 
the same matrices after "diagonalization" , with the nodes 
forming the first cluster placed in the top ranks, followed 
by the nodes forming the second cluster, and the nodes 
forming the third cluster. Only the relevant parts of the 
3CAPs are shown: nodes with lower ranks do not play 
any substantial role except for very large values of (3. We 
notice several features: 

• Not all highest degree nodes participate in the first 
clustering cores. Obviously, the selection of par- 
ticipating nodes is to some degree random, and 
when sufficiently many links are established they 
are frozen and cannot be changed easily later. This 
agrees with our previous observation that the posi- 
tions of the jumps change unsystematically with de- 
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FIG. 8: (color online) Relevant parts of 3-clique adjacency plots for the Yeast degree sequence. The color of each point 
indicates the number of 3cliques (or triangles) in which the link participates, as given by the scale on the right hand side. Each 
pair of plots shows (from top to bottom) the 3CAP for a typical member of the ensemble shortly before the first jump seen in 
Fig. [5] shortly after it, shortly after the second jump, and shortly after the third jump. The plots on the left hand side show 
the 3CAP with the nodes ranked in order of their degree. In the "diagonalized" plots we rearranged the ranking so that nodes 
which participate in the three clusters formed by each jump are ranked together, at the head of the list. The rest of the nodes 
are ranked by degree. 



tails like the random number sequence or the speed 
with which [3 is increased. 

• Clustering cores that have been formed once are not 
modified when [3 is further increased. Again this 



indicates that existing cores are essentially frozen. 

• Clustering cores corresponding to different steps do 
not overlap. 



10 



All three points are in perfect agreement with our pre- 
vious finding that hysteresis effects are strong and that 
structures which have been formed once are preserved 
when [3 is increased further. 

From other examples (and from later jumps for the 
same Yeast sequence) we know that the last two items 
in the list are not strictly correct in general, although 
changes of cores and overlap with previous cores do not 
occur often. Thus the results in Fig. [8] are too extreme 
to be typical. When a clustering core is formed, most of 
the links connected to these nodes will be saturated, and 
the few links left over will not have a big effect on the 
further evolution of the core. 

We find that as (3~ decreases, the clustering cores per- 
sist well below the value of (3 + at which they were created 
(not shown here) . This shows again that once a link par- 
ticipates in a large number of triangles, it is very stable 
and unlikely to be removed again. 

3-clique adjacency plots are also useful for analyz- 
ing empirical networks, independent of any rewiring null 
model, to help visualize community structure. While 
nodes in different communities often are linked, these 
links between communities usually take part in fewer 
triangles than links within communities. Thus simply 
replacing the standard adjacency matrix by the 3cliquc 
adjacency matrix should help discover and highlight com- 
munity structure (32j [33j [34] . 

In the top left panel of Figs [9] and [10] we show parts 
of the 3CAPs for the yeast protein-protein interaction 
and HEP networks respectively. In both cases, nodes are 
ranked by degree. We see that the triangles are mostly 
formed between strong hubs, as we should have expected. 
But clustering in the real networks does not strictly fol- 
low the degree pattern, in the sense that some of the 
strongest hubs are not members of prominent clusters. 
This shows again that real networks often have features 
which are not encoded in their degree sequence, and that 
a null model entirely based on the latter will probably fail 
to reproduce these features. We see also that links typi- 
cally participate in many triangles, if they participate in 
at least one. This is in contrast to a recently proposed 
clustering model, which assumes that each link can only 
participate in a single triangle |35j . 



D. Triangle conserving null models 



2(1(1 



empirical 




conserving 



40 



i i i : 5 : Si"ii 1 

JVi'b^iJ' 



node % 
200 



after first anneal/quench 



100 



annealing^ quenching 



20 



.—r..s.-.-.. .r .■ ... 



100 



200 , 
node j 




2(IU 



FIG. 9: (color online) Parts of 3CAPs for the real yeast 
protein-protein interaction network of [2Sj, for a typical net- 
work of the "triangle conserving" ensemble with no annealing, 
for a network obtained after an "annealing" period with /3 = 
and a subsequent quench with (5 7^ using 'triangle conserv- 
ing" rewirings |12) . and for an ensemble obtained by 500 of 
such annealing/quenching alternations. 
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FIG. 10: (color online) Analogous to Fig. [9j but for the real 
high energy physics collaboration network and for the HEP 
degree sequence, respectively. 



In the previous subsection we considered the case 
where the bias is "unidirectional" . In contrast to this, 
Milo et al. [H] considered the case where the bias tends 
to increase the number of triangles when it is below a 
number rJA,o ; but pushes it down when it is above. In this 
way one neither encounters any of the jumps discussed 
above nor any hysteresis. But that does not mean that 
the method is not plagued by the same basic problem, 
i.e. extreme sluggish dynamics and effectively broken cr- 
godicity. 

In the most straightforward implementation of trian- 



gle conserving rewiring with the Hamiltonian H^iio [12] 
one first estimates during preliminary runs a value of (3 
which is sufficiently large so that n& fluctuates around 
"A,o- Then one starts with the original true network 
and rewires it using this (3, without first 'annealing' it to 
(3 = 0. The effect of this is seen in the top right pan- 
els of Figs. [9] and 10 In both cases, the 3CAPs shown 
were obtained after > 10 9 attempted swaps. At (3 = 0, 
this number would have been much more than enough to 
equilibrize the ensemble. But for the large values of j3 
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far from the values for the real network. Thus the ensem- 
ble is a poor model for the real HEP network. We also 
see from Fig. 
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FIG. 11: (color online) Values of the assortativity r and of the 
number of 4-cliques in the real HEP network and in 400 mem- 
bers of the triangle-conserving biased ensemble. These 400 re- 
alizations were obtained approximately by 200 anneal/quench 
cycles with j3 = 2.3 and 200 cycles with j3 = 2.5, as described 
in the text. Notice that the results for biased simulations 
should become more exact as /3 decreases towards (3 C < 2. 



needed for these plots ((3 — 1.5 for Yeast, and (3 = 2.4 
for HEP), few changes from the initial configurations are 
seen. This is particularly true for the strongest clusters 
existing in the real networks. Triangles not taking part 
in these clusters change more rapidly, but are also less 
important. 

Thus we see a pitfall inherent in triangle conserving 
rewiring: when the bias is strong enough to push the 
number of triangles in the network up to the desired tar- 
get number, the bias will also be large enough that links 
between high degree nodes are hardly ever randomized. 

As a way out of this dilemma, we can alternate epochs 
where we use triangle conserving swaps with "annealing 
periods" where we use (3 = 0. In this way we would 
guarantee that memory is wiped out during each anneal- 
ing period (see the lower left panels in Figs. [9] and 10 1, 
and each "quenching epoch" would thus contribute es- 
sentially one independent configuration to the ensemble. 
After many such cycles we would obtain an ensemble 
which looks much more evenly sampled (lower right pan- 
els in Figs. [9] and 10 1, although even then we can not be 
sure that it really represents the equilibrium ensemble 
for the Hamiltonian H^iio- Apart from the last caveat, 
the method would presumably be too slow for practical 
applications where high accuracy and precise variances 
of ensemble observables are needed, since one needs one 
entire cycle per data point. But it can be useful in cases 
where it is sufficient to estimate fluctuations roughly, and 
where high precision is not an issue. To illustrate this, we 
present in Fig. [TT] results for the HEP network where we 
made 200 anneal/quench cycles for two different values 
of (3 {(3 — 2.3 and (3 = 2.5). In each cycle the quenching 
was stopped when the number of triangles reached the 
value of the real network, and the values of r and of the 
number of 4-cliques was recorded. We see from Fig. [TT] 
that these values scatter considerably, but are in all cases 



that r and «4_ c iiquc depend slightly on 
(3 (as was expected), but not so much as to invalidate the 
above conclusion. 



IV. CONCLUSION 

In highly clustered networks - and that means for most 
real world networks - most of the clustering is concen- 
trated amongst the highest degree nodes. The Strauss 
model correctly pointed to an important feature: clus- 
tering tends to be cooperative. Once many triangles are 
formed in a certain part of the network, they help in 
forming even more. Thus, clustering cannot be smoothly 
and evenly introduced into a network; it is often driven 
by densely interconnected, high-degree regions of the net- 
work. In triangle biased methods these high-degree re- 
gions can emerge quite suddenly and thereafter prove 
quite resistant to subsequent randomization. 

The biased rewiring model studied in the present pa- 
per is of exponential type, similar to the Strauss model, 
with the density of triangles controlled by a 'fugacity' 
or inverse 'temperature' (3. However, we prevent the 
catastrophic increase of connectivity at the phase tran- 
sition of the Strauss model by imposing a fixed degree 
sequence. Yet there is still a first order transition for ho- 
mogeneous networks, i.e. those with fixed degree. In the 
phase with strong clustering (large fugacity / low tem- 
perature), the configuration is basically a collection of 
disjoint k— cliques. 

If the degree sequence is not trivial, the formation of 
clustering cores can no longer happen at the same (3 for 
different parts of the network. Thus the single phase 
transition is replaced by a sequence of discrete and dis- 
continuous jumps, which resemble both first order transi- 
tions and Barkhausen jumps. As in the real Barkhausen 
phenomenon, frozen randomness is crucial for the mul- 
tiplicity of jumps. There, each jump corresponds to a 
flip of a spin cluster already defined by the randomness 
- at least at zero temperature [201 121] ■ In the present 
case, however, each jump corresponds to the creation of 
a cluster whose detailed properties are not fixed by the 
quenched randomness (the degree sequence), but depend 
also on the 'thermal' (non-quenched) noise. 

As in any first order phase transition, our model shows 
strong hysteresis. Clustering cores, once formed, are ex- 
tremely stable and cannot be broken up easily later. This 
limits its usefulness as a null model, even if it is treated 
numerically such that the phase transition jumps do not 
appear explicitly, as in the version of [12]. Because of the 
very slow time scales involved, Monte Carlo methods can- 
not sample evenly from these ensembles. Care should be 
taken to demonstrate that results found using them are 
broadly consistent across various sampling procedures. 

The spontaneous emergence of clustering cores in the 
BRM does suggest that triangle bias can give rise to com- 
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munity structure in networks, without the need to define 
communities a priori, thanks to the cooperativity of tri- 
angle formation. 

Together with jumps in the number of triangles (i.e. 
in the clustering coefficient), there are also jumps in all 
other network properties at the same control parameter 
positions. In particular, we found jumps in the num- 
ber of fc— cliques with k > 3, in the modularity, and in 
the assortativity. This immediately raises the question 
whether the model can be generalized so that a different 
fugacity is associated to each of these quantities. For as- 
sortativity, this was proposed some time ago by Newman 
[7]. With the present notation, biased rewiring models 
with and without target triangle number rtA.o an d target 
assortativity ro are given by the Hamiltonians 

#Miio(G;/?, 7 ) =/3|n A (G)-n A , |+ 7 |r(G)-r | (16) 

and 

H BRM (G;/?,7) = -/3n A (G)-7r(G), (17) 

respectively, where 7 is the fugacity associated to the as- 
sortativity. It is an interesting open question whether 
such a model might lead to less extreme clustering and 
thus might be more realistic. First simulations |39j indi- 
cate that driving assortativity leads to smooth increases 



of all other quantities without jumps. The reason for 
that seems to be that the basic mechanism leading to in- 
creased assortativity - the replacement of existing links 
by links between similar nodes - is not cooperative, but 
further studies are needed. 



As Newman remarked in |35j . clustering in networks 
"has proved difficult to model mathematically." In that 
paper he introduced a model where each link can par- 
ticipate in one triangle at most. In this way, the phase 
transitions seen in the Strauss model and in the present 
model are avoided. However, in the real-world networks 
studied here we found that the number of triangles in 
which a link participates is broadly distributed, suggest- 
ing that the Newman model [35] may not be realistic for 
networks with significant clustering. Indeed, specifying 
for each link the number of triangles in which it partici- 
pates adds valuable information to the adjacency matrix 
(which just specifies whether the link exists or not). The 
resulting '3 clique adjacency plots' revealed structures 
which would not have been easy to visualize otherwise 
and are useful also in other contexts. Thus, in contrast 
to what is claimed in [35], the quest for realistic models 
for network clustering is not yet finished. 
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